<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
                "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
  <title>Description of v_activlevg</title>
  <meta name="keywords" content="v_activlevg">
  <meta name="description" content="V_ACTIVLEVG Measure active speech level robustly [LEV,AF,FSO]=(sp,FS,MODE)">
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <meta name="generator" content="m2html &copy; 2003 Guillaume Flandin">
  <meta name="robots" content="index, follow">
  <link type="text/css" rel="stylesheet" href="../m2html.css">
</head>
<body>
<a name="_top"></a>
<div><a href="../index.html">Home</a> &gt;  <a href="index.html">v_mfiles</a> &gt; v_activlevg.m</div>

<!--<table width="100%"><tr><td align="left"><a href="../index.html"><img alt="<" border="0" src="../left.png">&nbsp;Master index</a></td>
<td align="right"><a href="index.html">Index for v_mfiles&nbsp;<img alt=">" border="0" src="../right.png"></a></td></tr></table>-->

<h1>v_activlevg
</h1>

<h2><a name="_name"></a>PURPOSE <a href="#_top"><img alt="^" border="0" src="../up.png"></a></h2>
<div class="box"><strong>V_ACTIVLEVG Measure active speech level robustly [LEV,AF,FSO]=(sp,FS,MODE)</strong></div>

<h2><a name="_synopsis"></a>SYNOPSIS <a href="#_top"><img alt="^" border="0" src="../up.png"></a></h2>
<div class="box"><strong>function [lev,xx] = v_activlevg(sp,fs,mode) </strong></div>

<h2><a name="_description"></a>DESCRIPTION <a href="#_top"><img alt="^" border="0" src="../up.png"></a></h2>
<div class="fragment"><pre class="comment">V_ACTIVLEVG Measure active speech level robustly [LEV,AF,FSO]=(sp,FS,MODE)

Inputs: sp     is the speech signal
        FS     is the sample frequency in Hz (see also FSO below)
        MODE   is a combination of the following:
               r - raw omit input filters (default is 200 Hz to 5.5 kHz)
               0 - no high pass filter (i.e. include DC)
               4 - high pass filter at 40 Hz (but allows mains hum to pass)
               1 - use cheybyshev 1 filter
               2 - use chebyshev 2 filter (default)
               e - use elliptic filter
               h - omit low pass filter at 5.5 kHz
               d - give outputs in dB rather than power
               n - output a normalized speech signal as the first argument
               N - output a normalized filtered speech signal as the first argument
               l - give additional power level estimates (see below for details)
               a - include A-weighting filter
               i - include ITU-R-BS.468/ITU-T-J.16 weighting filter

Outputs:
    If the &quot;n&quot; option is specified, a speech signal normalized to 0dB will be given as
    the first output followed by the other outputs.
        LEV    gives the speech level in units of power (or dB if mode='d')
               if mode='l' is specified, LEV is a row vector containing:
                       [ active-level mean-power mean-noise-power P.56-level harmonic-power-level]</pre></div>

<!-- crossreference -->
<h2><a name="_cross"></a>CROSS-REFERENCE INFORMATION <a href="#_top"><img alt="^" border="0" src="../up.png"></a></h2>
This function calls:
<ul style="list-style-image:url(../matlabicon.gif)">
<li><a href="v_activlev.html" class="code" title="function [lev,af,fso,vad]=v_activlev(sp,fs,mode)">v_activlev</a>	V_ACTIVLEV Measure active speech level as in ITU-T P.56 [LEV,AF,FSO]=(sp,FS,MODE)</li><li><a href="v_estnoiseg.html" class="code" title="function [x,zo]=v_estnoiseg(yf,tz,pp)">v_estnoiseg</a>	V_ESTNOISEG - estimate MMSE noise spectrum [x,zo]=(yf,tz,pp)</li><li><a href="v_fxpefac.html" class="code" title="function [fx,tx,pv,fv]=v_fxpefac(s,fs,tinc,m,pp)">v_fxpefac</a>	V_FXPEFAC PEFAC pitch tracker [FX,TT,PV,FV]=(S,FS,TINC,M,PP)</li><li><a href="v_spgrambw.html" class="code" title="function [t,f,b]=v_spgrambw(s,fs,varargin)">v_spgrambw</a>	V_SPGRAMBW Draw spectrogram [T,F,B]=(s,fs,mode,bw,fmax,db,tinc,ann)</li><li><a href="v_stdspectrum.html" class="code" title="function [b,a,si,sn]=v_stdspectrum(s,m,f,n,zi,bs,as)">v_stdspectrum</a>	V_STDSPECTRUM Generate standard acoustic/speech spectra in s- or z-domain [B,A,SI,SN]=(S,M,F,N,ZI,BS,AS)</li></ul>
This function is called by:
<ul style="list-style-image:url(../matlabicon.gif)">
</ul>
<!-- crossreference -->


<h2><a name="_source"></a>SOURCE CODE <a href="#_top"><img alt="^" border="0" src="../up.png"></a></h2>
<div class="fragment"><pre>0001 <a name="_sub0" href="#_subfunctions" class="code">function [lev,xx] = v_activlevg(sp,fs,mode)</a>
0002 <span class="comment">%V_ACTIVLEVG Measure active speech level robustly [LEV,AF,FSO]=(sp,FS,MODE)</span>
0003 <span class="comment">%</span>
0004 <span class="comment">%Inputs: sp     is the speech signal</span>
0005 <span class="comment">%        FS     is the sample frequency in Hz (see also FSO below)</span>
0006 <span class="comment">%        MODE   is a combination of the following:</span>
0007 <span class="comment">%               r - raw omit input filters (default is 200 Hz to 5.5 kHz)</span>
0008 <span class="comment">%               0 - no high pass filter (i.e. include DC)</span>
0009 <span class="comment">%               4 - high pass filter at 40 Hz (but allows mains hum to pass)</span>
0010 <span class="comment">%               1 - use cheybyshev 1 filter</span>
0011 <span class="comment">%               2 - use chebyshev 2 filter (default)</span>
0012 <span class="comment">%               e - use elliptic filter</span>
0013 <span class="comment">%               h - omit low pass filter at 5.5 kHz</span>
0014 <span class="comment">%               d - give outputs in dB rather than power</span>
0015 <span class="comment">%               n - output a normalized speech signal as the first argument</span>
0016 <span class="comment">%               N - output a normalized filtered speech signal as the first argument</span>
0017 <span class="comment">%               l - give additional power level estimates (see below for details)</span>
0018 <span class="comment">%               a - include A-weighting filter</span>
0019 <span class="comment">%               i - include ITU-R-BS.468/ITU-T-J.16 weighting filter</span>
0020 <span class="comment">%</span>
0021 <span class="comment">%Outputs:</span>
0022 <span class="comment">%    If the &quot;n&quot; option is specified, a speech signal normalized to 0dB will be given as</span>
0023 <span class="comment">%    the first output followed by the other outputs.</span>
0024 <span class="comment">%        LEV    gives the speech level in units of power (or dB if mode='d')</span>
0025 <span class="comment">%               if mode='l' is specified, LEV is a row vector containing:</span>
0026 <span class="comment">%                       [ active-level mean-power mean-noise-power P.56-level harmonic-power-level]</span>
0027 
0028 <span class="comment">% This is an implementation of the algorithm described in [1].</span>
0029 <span class="comment">%</span>
0030 <span class="comment">% Refs:</span>
0031 <span class="comment">%    [1] S. Gonzalez and M. Brookes.</span>
0032 <span class="comment">%        Speech active level estimation in noisy conditions.</span>
0033 <span class="comment">%        In Proc. IEEE Intl Conf. Acoustics, Speech and Signal Processing,</span>
0034 <span class="comment">%        pp 6684�6688, Vancouver, May 2013. doi: 10.1109/ICASSP.2013.6638955.</span>
0035 
0036 <span class="comment">%      Copyright (C) Sira Gonzalez, Mike Brookes 2008-2012</span>
0037 <span class="comment">%      Version: $Id: v_activlevg.m 10865 2018-09-21 17:22:45Z dmb $</span>
0038 <span class="comment">%</span>
0039 <span class="comment">%   VOICEBOX is a MATLAB toolbox for speech processing.</span>
0040 <span class="comment">%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html</span>
0041 <span class="comment">%</span>
0042 <span class="comment">%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%</span>
0043 <span class="comment">%   This program is free software; you can redistribute it and/or modify</span>
0044 <span class="comment">%   it under the terms of the GNU General Public License as published by</span>
0045 <span class="comment">%   the Free Software Foundation; either version 2 of the License, or</span>
0046 <span class="comment">%   (at your option) any later version.</span>
0047 <span class="comment">%</span>
0048 <span class="comment">%   This program is distributed in the hope that it will be useful,</span>
0049 <span class="comment">%   but WITHOUT ANY WARRANTY; without even the implied warranty of</span>
0050 <span class="comment">%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the</span>
0051 <span class="comment">%   GNU General Public License for more details.</span>
0052 <span class="comment">%</span>
0053 <span class="comment">%   You can obtain a copy of the GNU General Public License from</span>
0054 <span class="comment">%   http://www.gnu.org/copyleft/gpl.html or by writing to</span>
0055 <span class="comment">%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.</span>
0056 <span class="comment">%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%</span>
0057 <span class="keyword">persistent</span> ro snr_ro offset c25zp c15zp e5zp
0058 <span class="keyword">if</span> isempty(ro)
0059     ro = [0,0.124893,0.160062,0.212256,0.279864,0.351281,0.437356,<span class="keyword">...</span>
0060         0.550314,0.680075,0.795741,0.892972,0.939087,1];  <span class="comment">%  mapping from SNR to rho</span>
0061     snr_ro = -2:0.5:4;
0062     offset = 0.8472; <span class="comment">% correction factor for harmonic power in dB</span>
0063     c25zp=[0.37843443673309i 0.23388534441447i; -0.20640255179496+0.73942185906851i -0.54036889596392+0.45698784092898i];
0064     c25zp=[[0; -0.66793268833792] c25zp conj(c25zp)];
0065     <span class="comment">%       [c1z,c1p,c1k] = cheby1(5,0.25,1,'high','s');</span>
0066     c15zp=[-0.659002835294875+1.195798636925079i -0.123261821596263+0.947463030958881i];
0067     c15zp=[zeros(1,5); -2.288586431066945 c15zp conj(c15zp)];
0068     <span class="comment">%      [ez,ep,ek] = ellip(5,0.25,50,1,'high','s')</span>
0069     e5zp=[0.406667680649209i 0.613849362744881i; -0.538736390607201+1.130245082677107i -0.092723126159100+0.958193646330194i];
0070     e5zp=[[0; -1.964538608244084]  e5zp conj(e5zp)];
0071     <span class="comment">%    w=linspace(0.2,2,100);</span>
0072     <span class="comment">%    figure(1); plot(w,20*log10(abs(freqs(real(poly(c15zp(1,:))),real(poly(c15zp(2,:))),w)))); title('Chebyshev 1');</span>
0073     <span class="comment">%    figure(2); plot(w,20*log10(abs(freqs(real(poly(c25zp(1,:))),real(poly(c25zp(2,:))),w)))); title('Chebyshev 2');</span>
0074     <span class="comment">%    figure(3); plot(w,20*log10(abs(freqs(real(poly(e5zp(1,:))),real(poly(e5zp(2,:))),w)))); title('Elliptic');</span>
0075 <span class="keyword">end</span>
0076 <span class="keyword">if</span> nargin&lt;3
0077     mode=<span class="string">' '</span>;
0078 <span class="keyword">end</span>
0079 fso.ffs=fs;                           <span class="comment">% sample frequency</span>
0080 <span class="keyword">if</span> any(mode==<span class="string">'r'</span>)                   <span class="comment">% included for backward compatibility</span>
0081     mode=[<span class="string">'0h'</span> mode];               <span class="comment">% abolish both filters</span>
0082 <span class="keyword">elseif</span> fs&lt;14000
0083     mode=[<span class="string">'h'</span> mode];               <span class="comment">% abolish lowpass filter at low sample rates</span>
0084 <span class="keyword">end</span>
0085 <span class="comment">% s-plane zeros and poles of high pass 5'th order filter -0.25dB at w=1 and -50dB stopband</span>
0086 <span class="keyword">if</span> any(mode==<span class="string">'1'</span>)
0087     szp=c15zp;            <span class="comment">% Chebyshev 1</span>
0088 <span class="keyword">elseif</span> any(mode==<span class="string">'e'</span>)
0089     szp=e5zp;             <span class="comment">% Elliptic</span>
0090 <span class="keyword">else</span>
0091     szp=c25zp;            <span class="comment">% Chebyshev 2</span>
0092 <span class="keyword">end</span>
0093 <span class="comment">% calculate high pass filter at 40 or 200 Hz</span>
0094 <span class="keyword">if</span> all(mode~=<span class="string">'0'</span>)
0095     <span class="keyword">if</span> any(mode==<span class="string">'4'</span>)
0096         fl=40;               <span class="comment">% 40 Hz cutoff</span>
0097     <span class="keyword">else</span>
0098         fl=200;              <span class="comment">% 200 Hz cutoff</span>
0099     <span class="keyword">end</span>
0100     zl=2./(1-szp*tan(fl*pi/fs))-1;      <span class="comment">% 40 or 200 Hz LF limit</span>
0101     al=real(poly(zl(2,:)));              <span class="comment">% high pass filter</span>
0102     bl=real(poly(zl(1,:)));
0103     sw=(-1).^(0:5)';                    <span class="comment">%z^(-n) for z=-1</span>
0104     bl=bl*(al*sw)/(bl*sw);           <span class="comment">% scale to give HF gain of 1</span>
0105 <span class="keyword">end</span>
0106 <span class="comment">% calculate low pass filter at 5500 Hz</span>
0107 <span class="keyword">if</span> all(mode~=<span class="string">'h'</span>)
0108     zh=2./(szp/tan(5500*pi/fs)-1)+1;
0109     ah=real(poly(zh(2,:)));
0110     bh=real(poly(zh(1,:)));
0111     bh=bh*sum(ah)/sum(bh);
0112 <span class="keyword">end</span>
0113 <span class="keyword">if</span> any(mode==<span class="string">'a'</span>)
0114     [bw aw]=<a href="v_stdspectrum.html" class="code" title="function [b,a,si,sn]=v_stdspectrum(s,m,f,n,zi,bs,as)">v_stdspectrum</a>(2,<span class="string">'z'</span>,fs);
0115 <span class="keyword">elseif</span> any(mode==<span class="string">'i'</span>)
0116     [bw aw]=<a href="v_stdspectrum.html" class="code" title="function [b,a,si,sn]=v_stdspectrum(s,m,f,n,zi,bs,as)">v_stdspectrum</a>(8,<span class="string">'z'</span>,fs);
0117 <span class="keyword">end</span>
0118 <span class="comment">% apply the input filters to the speech</span>
0119 <span class="keyword">if</span> all(mode~=<span class="string">'0'</span>)
0120     sq=filter(bl,al,sp(:));     <span class="comment">% highpass filter</span>
0121 <span class="keyword">else</span>
0122     sq=sp(:);
0123 <span class="keyword">end</span>
0124 <span class="keyword">if</span> all(mode~=<span class="string">'h'</span>)
0125     sq=filter(bh,ah,sq(:));     <span class="comment">% lowpass filter</span>
0126 <span class="keyword">end</span>
0127 <span class="keyword">if</span> any(mode==<span class="string">'a'</span>) || any(mode==<span class="string">'i'</span>)
0128     sq=filter(bw,aw,sq(:));     <span class="comment">% weighting filter</span>
0129 <span class="keyword">end</span>
0130 p56 = <a href="v_activlev.html" class="code" title="function [lev,af,fso,vad]=v_activlev(sp,fs,mode)">v_activlev</a>(sq,fs,<span class="string">'0hl'</span>);            <span class="comment">% get active level from P.56 method (no extra filters)</span>
0131 [fx,dum,pv]=<a href="v_fxpefac.html" class="code" title="function [fx,tx,pv,fv]=v_fxpefac(s,fs,tinc,m,pp)">v_fxpefac</a>(sp,fs);              <span class="comment">% Estimate f0 and voiced probability from unfiltered speech</span>
0132 [tz,f,S]=<a href="v_spgrambw.html" class="code" title="function [t,f,b]=v_spgrambw(s,fs,varargin)">v_spgrambw</a>(sq,fs,20,[0 5 fs/2],[],0.01);   <span class="comment">% spectrogram with 20 Hz bandwidth and 5 Hz/10 ms resolution</span>
0133 
0134 nh = 15;         <span class="comment">% Number of harmonics to evaluate</span>
0135 tv = find(pv&gt;0.5);  <span class="comment">% Find voiced frames</span>
0136 nv=length(tv);      <span class="comment">% number of voiced frames</span>
0137 ran = -60:60;       <span class="comment">% Frequency range of the mexican hat (+/- 60Hz)</span>
0138 nm = length(ran);   <span class="comment">% width of mexican hat</span>
0139 o = 15;             <span class="comment">% semi-width of central lobe</span>
0140 mexic = (1-(ran/o).^2).*exp(-(ran.^2)/(2*o.^2)); <span class="comment">% Mexican hat wavelet</span>
0141 <span class="comment">%% calculate harmonic energy in voiced frames</span>
0142 Et=zeros(nv,1);
0143 <span class="keyword">for</span> jv =1:nv                                    <span class="comment">% loop for each voiced frame</span>
0144     f_har = fx(tv(jv)).*(1:nh).';               <span class="comment">% Calculate frequencies of harmonics</span>
0145     lev = repmat(f_har,1,nm)+repmat(ran,nh,1);  <span class="comment">% frequencies to test: nh x nm</span>
0146     data = interp1(f,S(tv(jv),:),lev,<span class="string">'spline'</span>); <span class="comment">% interpolate spectrogram onto required frequencies</span>
0147     Eh = sum(data.*repmat(mexic,nh,1),2);       <span class="comment">% estimate power in each harmonic</span>
0148     Et(jv) = sum(Eh(Eh&gt;0));                     <span class="comment">% sum the non-negative harmonic powers</span>
0149 <span class="keyword">end</span>
0150 har = mean(Et);                                 <span class="comment">% mean power of voiced frames (not including offset)</span>
0151 <span class="comment">%% Noise estimate</span>
0152 dp=<a href="v_estnoiseg.html" class="code" title="function [x,zo]=v_estnoiseg(yf,tz,pp)">v_estnoiseg</a>(S,tz(2)-tz(1));                        <span class="comment">% estimate noise power spectrum</span>
0153 dpm=mean((f(2)-f(1))*sum(dp,2));                    <span class="comment">% mean noise psd</span>
0154 snr_est=10*log10(har/dpm);  <span class="comment">% estimate global SNR (should this be restricted to voiced frames?)</span>
0155 <span class="comment">%% Combine methods</span>
0156 
0157 levp56 = 10*log10(p56(1));
0158 levharm = 10*log10(har)+offset;
0159 <span class="keyword">if</span> snr_est&lt;=-2
0160     lev = levharm;
0161 <span class="keyword">elseif</span> snr_est&gt;4
0162     lev = levp56;
0163 <span class="keyword">else</span>
0164     ro1 = interp1(snr_ro,ro,snr_est);               <span class="comment">% interpolate the value of rho</span>
0165     lev = ro1*levp56 + (1-ro1)*levharm;
0166 <span class="keyword">end</span>
0167 levdb=lev;
0168 <span class="keyword">if</span> any(mode==<span class="string">'l'</span>)
0169 <span class="comment">%                       [ active-level mean-power mean-noise-power P.56-level mean-harmonic-power]</span>
0170     lev=[levdb 10*log10(p56(2)) 10*log10(dpm) levp56 levharm];
0171 <span class="keyword">end</span>
0172 <span class="keyword">if</span> all(mode~=<span class="string">'d'</span>)
0173     lev=10.^(0.1*lev);
0174 <span class="keyword">end</span>
0175 <span class="keyword">if</span> any(mode==<span class="string">'n'</span>) || any(mode==<span class="string">'N'</span>)
0176     xx=lev;
0177     <span class="keyword">if</span> any(mode==<span class="string">'n'</span>)
0178         sq=sp;
0179     <span class="keyword">end</span>
0180     <span class="keyword">if</span> dpm&gt;0
0181         lev=sq*10^(-0.05*levdb);
0182     <span class="keyword">else</span>
0183         lev=sq;
0184     <span class="keyword">end</span>
0185 <span class="keyword">end</span></pre></div>
<hr><address>Generated by <strong><a href="http://www.artefact.tk/software/matlab/m2html/">m2html</a></strong> &copy; 2003</address>
</body>
</html>